Let's be honest: An optimal no-regret framework for zero-sum games
نویسندگان
چکیده
We revisit the problem of solving two-player zerosum games in the decentralized setting. We propose a simple algorithmic framework that simultaneously achieves the best rates for honest regret as well as adversarial regret, and in addition resolves the open problem of removing the logarithmic terms in convergence to the value of the game. We achieve this goal in three steps. First, we provide a novel analysis of the optimistic mirror descent (OMD), showing that it can be modified to guarantee fast convergence for both honest regret and value of the game, when the players are playing collaboratively. Second, we propose a new algorithm, dubbed as robust optimistic mirror descent (ROMD), which attains optimal adversarial regret without knowing the time horizon beforehand. Finally, we propose a simple signaling scheme, which enables us to bridge OMD and ROMD to achieve the best of both worlds. Numerical examples are presented to support our theoretical claims and show that our non-adaptive ROMD algorithm can be competitive to OMD with adaptive step-size selection.
منابع مشابه
A TRANSITION FROM TWO-PERSON ZERO-SUM GAMES TO COOPERATIVE GAMES WITH FUZZY PAYOFFS
In this paper, we deal with games with fuzzy payoffs. We proved that players who are playing a zero-sum game with fuzzy payoffs against Nature are able to increase their joint payoff, and hence their individual payoffs by cooperating. It is shown that, a cooperative game with the fuzzy characteristic function can be constructed via the optimal game values of the zero-sum games with fuzzy payoff...
متن کاملRegret Minimization in Behaviorally-Constrained Zero-Sum Games
No-regret learning has emerged as a powerful tool for solving extensive-form games. This was facilitated by the counterfactual-regret minimization (CFR) framework, which relies on the instantiation of regret minimizers for simplexes at each information set of the game. We use an instantiation of the CFR framework to develop algorithms for solving behaviorally-constrained (and, as a special case...
متن کاملReduced Space and Faster Convergence in Imperfect-Information Games via Regret-Based Pruning
Counterfactual Regret Minimization (CFR) is the most popular iterative algorithm for solving zero-sum imperfect-information games. Regret-Based Pruning (RBP) is an improvement that allows poorly-performing actions to be temporarily pruned, thus speeding up CFR. We introduce Total RBP, a new form of RBP that reduces the space requirements of CFR as actions are pruned. We prove that in zero-sum g...
متن کاملRegret Minimization in Non-Zero-Sum Games with Applications to Building Champion Multiplayer Computer Poker Agents
In two-player zero-sum games, if both players minimize their average external regret, then the average of the strategy profiles converges to a Nash equilibrium. For n-player general-sum games, however, theoretical guarantees for regret minimization are less understood. Nonetheless, Counterfactual Regret Minimization (CFR), a popular regret minimization algorithm for extensiveform games, has gen...
متن کاملHonest Signaling in Zero-Sum Games Is Hard, and Lying Is Even Harder
We prove that, assuming the exponential time hypothesis, finding an ǫ-approximately optimal symmetric signaling scheme in a two-player zero-sum game requires quasi-polynomial time (n). This is tight by [6] and resolves an open question of Dughmi [10]. We also prove that finding a multiplicative approximation is NP-hard.
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1802.04221 شماره
صفحات -
تاریخ انتشار 2018